BEST: Exploratory analysis

The BEST project aims to understand what are the behavioural responses of people who rely on natural resources for their livelihood, and whose resources could potentially present threshold like dynamics, aka. “regime shifts”. For this study we designed a game played with fishermen in four communities of the Colombian caribbean. Here I present a short exploration of the game data and survey.

Where?

The game was performed in four coastal communities of the Colombian Caribbean. Taganga is traditional fishing town that in the last decades has become more of a turistic hub for backpackers. Fishing is still important in the area and is mainly done in the sea, sometimes on the marine territory of Tayrona National Park. Tasajeras is a small fishing community located on the road that communicates Santa Marta and Barranquilla, two major cities. Their location favors fishing both in the sea and inside the Cienaga Grande de Santa Marta, a wetland complex of brakish water, mangroves and a very special ecosystem that depends on the exchange of salt water from the sea and fresh water from the Magdalena river. The inhabitants of Buenavista literally live on the Cienaga. Their houses stand just above water on an area of the Cienaga that is flooded all year round. They are the poorest community we visited and they depend completely on explotation of natural resources for survival. Las Flores, in contrast, is actually a neighbourhood of Barranquilla located in the outskirts of the city. They fish on the Magdalena river’s mouth with very special techniques that use wind and kites to fish on the sea. They also exploit resources in neigbouring wetlands formed by the delta.

We designed the experiment to test for differences on treatment, but also to test for difference on ecological conditions, expecting that communities who rely more strongly on the Cienaga (a closed system that has undergone regime shifts in the past) would behave differently that communities exposed more to oceanic conditions; where variability has perhaps masked/buffered regime shifts.

load('~/Documents/Projects/BEST - Beijer/BEST/mapBEST.RData')
ggmap(map) + 
  geom_point(aes(x=lon, y=lat, colour='Orange', size=2), data=coords, alpha=0.5, 
             show.legend = F) +
  geom_text(aes(x=lon, y=lat), data= coords ,
            label=c('Taganga', 'Tasajeras', 'Buenavista', 'Las Flores'), 
            size=4, colour='blue', nudge_y = 0.03) + ggtitle('Study area')

A graphical exploration of the survey

We ran a survey with each of the 256 participants of our fishing game. The 56 questions of the survey are grouped around five topics:

  1. the game
  2. fishing activities
  3. changes in resources and traditional ecological knowledge
  4. cooperation and community activities
  5. demographic information at household level

For each section I present a series of graphic summaries for questions whose answer was binary or numeric. Later I elaborate on more detailed information for text data. The version of this graphical summary is a lazy version. I’ve written a function that works on any question of the survey but does not modified the x-axis labels for any figure. This allow us to identify preliminary problems with data, e.g. wether it has been errors on digitalizing the raw data. However, unless is explicitely said on the figure label, 0 means no and 1 means yes.

1. The game

Note that Q3 has an error, it should be binary 1/0 and has values of 2. Most people was not expecting the end of the game and wanted to play at least 5 more rounds. That’s good as control for end of game effect.

q1 <- question (surv, q1=372, q2=25, q3=2, fun=mean) + ggtitle('Q2. Life satisfaction\n 1= very satisfied : 4 = very unsatisfied')
q2 <- question (surv, q1=372, q2=26, q3 = 2, fun=mean) + ggtitle('Q3. Have you participated in economic\n experiments before?')
q3 <- question (surv, q1=372, q2=27, q3 = 2, fun=mean)  + ggtitle('Q4. Did you played with any of your\n fishing partners?')
q4 <- question (surv, q1=372, q2=28, q3 = 2, fun=mean)  + ggtitle('Q5. Were you surprised at the end?')
q5 <- question (surv, q1=372, q2=29, q3 = 2, fun=mean)  + ggtitle('Q6. How many extra rounds were you expecting?\n 0 = none; 1 = <5; 2 = >5')
grid.arrange(q1,q2,q3,q4,q5, nrow=1, ncol=5)

2. Fishing activities

Most people started fishing as teenager, however it varies depending of place. For future analysis a density plot might better reveal such differences. We have errors on data input for questions 8 & 10. Most people have been fishermen all then time since they started, and most of them have also been fishing on the same community. For less than half of our sample, people stop fishing at least a month last year (2015)

grid.arrange(
  question (surv, q1=372, q2=31, q3 = 2, fun=mean)  + ggtitle('7. How old did you start fishing?'),
  question (surv, q1=372, q2=32, q3 = 2, fun=mean)  + ggtitle('8. Do you fish most of the time?'),
  question (surv, q1=372, q2=33, q3 = 2, fun=mean)  + ggtitle('9. Have you been fishing here\n since you started?'),
  question (surv, q1=372, q2=35, q3 = 2, fun=mean)  + ggtitle('10. Last year, there were months\n when you have not fished?'),
  nrow=1, ncol=4
)

I’d like to get a cleaner version of the fishing calendar (e.g. a heatmap), but for now the figure below summarizes when were people non-fishing last year. Note is a crappy graph, not all plots are on the same scale… but aesthetics apart, one can see that not surprisingly the month when people stop fishing more often is December, probably also due to festivities.

grid.arrange(
  question (surv, q1=372, q2=36, q3 = 2, fun=mean)  + ggtitle('January'),
  question (surv, q1=372, q2=37, q3 = 2, fun=mean)  + ggtitle('Februrary'),
  question (surv, q1=372, q2=38, q3 = 2, fun=mean)  + ggtitle('March'),
  question (surv, q1=372, q2=39, q3 = 2, fun=mean)  + ggtitle('April'),
  question (surv, q1=372, q2=40, q3 = 2, fun=mean)  + ggtitle('May'),
  question (surv, q1=372, q2=41, q3 = 2, fun=mean)  + ggtitle('June'),
  question (surv, q1=372, q2=42, q3 = 2, fun=mean)  + ggtitle('July'),
  question (surv, q1=372, q2=43, q3 = 2, fun=mean)  + ggtitle('August'),
  question (surv, q1=372, q2=44, q3 = 2, fun=mean)  + ggtitle('September'),
  question (surv, q1=372, q2=45, q3 = 2, fun=mean)  + ggtitle('October'),
  question (surv, q1=372, q2=46, q3 = 2, fun=mean)  + ggtitle('November'),
  question (surv, q1=372, q2=47, q3 = 2, fun=mean)  + ggtitle('December'),
  nrow=2, ncol=6
)

The questions about fishing effort are not well coded. In question 16 a & b both . and , were used to designate decimals which results on errors and not plot shown. We also have an outlier regarding earning in Colombian pesos, who said that on a bad day will make 3 * 106 which is highly unrealistic. Errors aside, most people fish 6 days a week and the most common fishing day is of 12 hours, followed by working days of 10 and 8 hours. Although the graphs for earnings are less than suitable here (re-scale and use density instead of geom_bar) the minimum daily wage on an average day is Col$ 500 and the median is Col$ 310^{4} which is less than US$10. On a bad day most people don’t earn anything and the median is 2000 which is less than 10Sek. Most people have bad days once or more than once a week.

grid.arrange(
  # repeat graph with position = 'dodge' in geom_bar
  question (surv, q1=372, q2=49, q3 = 2, fun=mean)  + ggtitle('14. Days fishing in a normal week'), 
  question (surv, q1=372, q2=50, q3 = 2, fun=mean)  + ggtitle('15. Fishing hours in a normal day'),
  #question (surv, q1=372, q2=51, q3 = 2, fun=mean)  + ggtitle('16a. Kg of fish in a normal day'), ## error!!! summary(surv[,51]) reveals there is . and , used for decimals
  #question (surv, q1=372, q2=52, q3 = 2, fun=mean)  + ggtitle('16b. Earnings in Col$ in a normal day'), ## error summary(surv[,52]) # one of the datapoins is 3 million pesos per day!
  question (surv, q1=372, q2=53, q3 = 2, fun=mean)  + ggtitle('16c. Kg of fish in a good day'), # error
  question (surv, q1=372, q2=54, q3 = 2, fun=mean) 
    + geom_histogram(aes(alpha=0.2)) + ggtitle('16d. Earnings Col$ in a good day'), # error
  question (surv, q1=372, q2=55, q3 = 2, fun=mean) + ggtitle('16e. Kg of fish in a bad day'), # error
  question (surv, q1=372, q2=56, q3 = 2, fun=mean) + geom_histogram(aes(alpha=0.2)) + ggtitle('16f. Earnings Col$ in a bad day'), # error in distribution table(surv[,56])/16
  question (surv, q1=372, q2=57, q3 = 2, fun=mean)  
    + ggtitle('16f. How often do you have a bad day?\n 1 = 1/year, 2 = 1/month, 3 = 1/week, 4 = n/week'),
  nrow=2, ncol=4
)

Most people fish in company of other fishermen, but almost a fifth of our surveyed fishermen fish alone. Crew size varies across places, very few people fish with crews > 12, and they are from Taganga or Las Flores, the fishing towns where open sea fishing with more industrial styles are possible. In fact, the maximum crew size for Tasajera is 5 and in Buenavista is 3 or 4. Note we have an error on the raw data for Buenavista datapoints. Most people fish with the same crew, but most of the time the decision on when to go fishing is made by other person than the intervieweed. The most typical boat is bote which implies a small size vessel without engine or with a small power engine. Only few people (<20) fish on lanchas which are bigger boats with more powerful engines, most of them are located in Las Flores where wind and currents are strong due to the Magdalena river delta. Similarly, around 60% of fishermen are not the captain neither own the boat they fish with. They don’t fish on the same places either. Half of the people own their fishing art being ownership most common in Buenavista and less common in Taganga.

grid.arrange(
  question (surv, q1=372, q2=58, q3 = 2, fun=mean)  + ggtitle('18. Do you fish with someone\n else?'),
  question (surv, q1=372, q2=59, q3 = 2, fun=mean)  + ggtitle('18.1 How often?\n 1=rare, 2=1/2times, 3=most, 4=always'),
  question (surv, q1=372, q2=60, q3 = 2, fun=mean)  + ggtitle('18.2 How many are you?'), # repeat this plot with density table(surv[,60])/16
  question (surv, q1=372, q2=61, q3 = 2, fun=mean)  + ggtitle('18.3 Same crew?'),
  question (surv, q1=372, q2=62, q3 = 2, fun=mean)  + ggtitle('18.4 Who decides when to fish?\n 0=me 1=some else'),
  question (surv, q1=372, q2=64, q3 = 2, fun=mean)  + ggtitle('19. Do you fish by boat\n or lancha?'),
  question (surv, q1=372, q2=65, q3 = 2, fun=mean)  + ggtitle('19.1 Are you the captain?'),
  question (surv, q1=372, q2=66, q3 = 2, fun=mean)  + ggtitle('19.2 Do you own the boat?'),
  question (surv, q1=372, q2=67, q3 = 2, fun=mean)  + ggtitle('20. Do you fish in the same place?'),
  question (surv, q1=372, q2=89, q3 = 2, fun=mean)  + ggtitle('23. Do you own the fishing art?'),
  nrow=2, ncol=5
)

Where does the food goes? Most fishermen take some fish home but they usually sell more than half and ocassionally give some away. When asked if they imagine themeselves being fishermen in 10 years most people think they will be. Question 28 is horribly coded, it should be between 0:4 but has values >5 up to 10! There is also typing errors in Tasajera where many values were included in answers that should have been unique.

grid.arrange(
  question (surv, q1=372, q2=90, q3 = 2, fun=mean)  + ggtitle('24. How much do you take homes?\n 0=none, 1=some, 2=half, 3=>half, 4=all'),
  question (surv, q1=372, q2=91, q3 = 2, fun=mean)  + ggtitle('25. How much do you sell?\n 0=none, 1=some, 2=half, 3=>half, 4=all'),
  question (surv, q1=372, q2=93, q3 = 2, fun=mean)  + ggtitle('26. How much do you give away?\n 0=none, 1=some, 2=half, 3=>half, 4=all'),
  question (surv, q1=372, q2=93, q3 = 2, fun=mean)  + ggtitle('27. Do you think you will be fishermen in 10yrs?\n 0=NO, 1=no, 2=yes, 3=YES, 4=dont know'),
  question (surv, q1=372, q2=96, q3 = 2, fun=mean)  + ggtitle('28. Do you think your children will fish?\n 0=NO, 1=no, 2=yes, 3=YES, 4=dont know'),
  nrow=1, ncol=5
)

3. Changes in resources & TEK

When we asked about species dynamics people reported that there has been speces that they don’t fish as much as before. More than 150 respondants report this changes as dramatic. However, this part of the survey is horribly coded. There are errors in data for Tasajera for questions about when the species dissapeared or if is still missing, values should have been numeric but it was not coded that way. Errors lead to no plots produced.

grid.arrange(
  question (surv, q1=372, q2=98, q3 = 2, fun=mean)  + ggtitle('31. Have you been fishing the same spp?'), # should be binary
  question (surv, q1=372, q2=100, q3 = 2, fun=mean)  + ggtitle('33. Is there any species that you dont fish as much as before?'), # error
  question (surv, q1=372, q2=102, q3 = 2, fun=mean)  + ggtitle('35. Dramatic changes?'), # error
  question (surv, q1=372, q2=104, q3 = 2, fun=mean)  + ggtitle('35.3. Sp1 When?'), ### Errors, horribly coded! summary(surv[,104])
  question (surv, q1=372, q2=105, q3 = 2, fun=mean)  + ggtitle('35.4. Sp1 How long, still missing?'),
  question (surv, q1=372, q2=109, q3 = 2, fun=mean)  + ggtitle('35.3. Sp2 When?'),
  question (surv, q1=372, q2=110, q3 = 2, fun=mean)  + ggtitle('35.4. Sp2 How long, still missing?'),
  question (surv, q1=372, q2=114, q3 = 2, fun=mean)  + ggtitle('35.3. Sp3 When?'),
  question (surv, q1=372, q2=115, q3 = 2, fun=mean)  + ggtitle('35.4. Sp3 How long, still missing?'),
  question (surv, q1=372, q2=119, q3 = 2, fun=mean)  + ggtitle('35.3. Sp4 When?'),
  question (surv, q1=372, q2=120, q3 = 2, fun=mean)  + ggtitle('35.4. Sp4 How long, still missing?'),
  nrow=2, ncol=6
)

Responses on abrupt events on fishing: In question 35 people were asked if they have experienced an abrupt change (above graph). Questions 36:40 were only asked to people who did experience abrupt changes, so N != 256. People were asked if the abrupt event described on table/question 35 have affected what they fish. Most people answer positively, nevertheless about 40 people say ‘no’ (Q36 error with 10 in Buenavista where it should have been 1/0). Since the occurrance of the event, most people spend more time fishing, less than 20 spends the same or less time respectively. Just above 60 people changed their fishing area while just below 50 didn’t (Q36.3 error non-binary value in Las Flores). Most people expected abrupt changes in the future. Question 39 was a multiple choice answer type, but only to ~50 respondents were read the options out loud. In case of an abrupt event fewer people would continue fishing, most of them won’t increase effort, but they won’t reduce their fishing time neither, less than 40 would consider changing fishing grounds or fishing arts, and most of them wont stop fishing. A lot of questions in this section have coding errors for data in Taganga with contradictory values for the same individual, or with values of 2 when it should be 1/0.

grid.arrange(
  question (surv, q1=372, q2=123, q3 = 2, fun=mean)  + ggtitle('36. Did the dramatic event\n change your catch?'), # error with a 10 in Buenavista
  question (surv, q1=372, q2=125, q3 = 2, fun=mean)  + ggtitle('36.2 Do you spend more time fishing\n 0=less, 1=more, 2=same?'),
  question (surv, q1=372, q2=126, q3 = 2, fun=mean)  + ggtitle('36.3 Did you changed your\n fishing area?'),
  question (surv, q1=372, q2=127, q3 = 2, fun=mean)  + ggtitle('37. Do you expect other changes?'),
  question (surv, q1=372, q2=129, q3 = 2, fun=mean)  + ggtitle('39.1 What would you do if there\n is abrupt reduction of fish\n Where the question options read out loud?'),
  question (surv, q1=372, q2=130, q3 = 2, fun=mean)  + ggtitle('39.2 Continue fishing'),
  question (surv, q1=372, q2=132, q3 = 2, fun=mean)  + ggtitle('39.3 Increase effort'), # error in taganga
  question (surv, q1=372, q2=134, q3 = 2, fun=mean)  + ggtitle('39.4 Less fishing'),# error in taganga
  question (surv, q1=372, q2=136, q3 = 2, fun=mean)  + ggtitle('39.5 Change fishing area'),# error in taganga
  question (surv, q1=372, q2=138, q3 = 2, fun=mean)  + ggtitle('39.6 Change fishing art'), # error taganga
  question (surv, q1=372, q2=140, q3 = 2, fun=mean)  + ggtitle('39.7 No more fishing'), # not binary
  nrow=2, ncol=6
)

4. Cooperation and community

Most fishermen shared their fishing arts, only ~100 belong to a fishing coop. The community with stronger coop presence is Las Flores (remember the local contact was the coop leader!) while the community with less cooperative participation is Buenavista. People participating in coops usually meet once a month followed by twice a year. Surprisingly, most participants self-report to be treasurers (but there is only one treasurer per coop!?). The graphs about the reasons to belong to a coop doesn’t make much sense. Future versions should compare (bars) for positive answers, here I’m not sure if negative was coded as default or that they did not perceive the benefit. For example, for ‘better prices’ only 20 people report that benefit while ~80 said ‘no’ to that question. Here 0 means that they don’t get the benefit, or that the option was not mentioned? In any case, the most popular positive answers were about recognition as organizaton member and social aspects. Errors: In Q40 is not binary, error in Taganga data; Q42.3 has errors due to coding, for questions where time was asked interviewers were not consistent on the way they wrote down answers. Some times they used years, months, numbers, text and for now we cannot extract the info automatically.

grid.arrange(
  question (surv, q1=372, q2=143, q3 = 2, fun=mean)  + ggtitle('40. Share fishing art'), # error not binary
  question (surv, q1=372, q2=145, q3 = 2, fun=mean)  + ggtitle('42. Belongs to coop?'),
  question (surv, q1=372, q2=147, q3 = 2, fun=mean)  + ggtitle('42.2 How often do you meet?\n 1=1/yr, 2=2/yr, 3=1/month, 4=1/week'), # not binary
  question (surv, q1=372, q2=148, q3 = 2, fun=mean)  + ggtitle('42.3 Since when?'), # error due to coding summary(surv[,148])
  question (surv, q1=372, q2=149, q3 = 2, fun=mean)  + ggtitle('42.4 What is your role?\n 1=representant, 2=president, 3=secretary, \n 4=treasurer, 5=member, 6=other'),
  question (surv, q1=372, q2=152, q3 = 2, fun=mean)  + ggtitle('42.5.1. options read out loud?') ,
  question (surv, q1=372, q2=153, q3 = 2, fun=mean)  + ggtitle('42. Better prices'),
  question (surv, q1=372, q2=154, q3 = 2, fun=mean)  + ggtitle('42. Conservation of fish resource'),
  question (surv, q1=372, q2=155, q3 = 2, fun=mean)  + ggtitle('42. Support if low income'),
  question (surv, q1=372, q2=156, q3 = 2, fun=mean)  + ggtitle('42. Recognition as org member'),
  question (surv, q1=372, q2=157, q3 = 2, fun=mean)  + ggtitle('42. was pushed'),
  question (surv, q1=372, q2=158, q3 = 2, fun=mean)  + ggtitle('42. social aspects'),
  question (surv, q1=372, q2=159, q3 = 2, fun=mean)  + ggtitle('42. increase fishing & income'),
  question (surv, q1=372, q2=160, q3 = 2, fun=mean)  + ggtitle('42. Better quality of fishing products'),
  nrow=3, ncol=5
)

5. Demographics

Almost all our game participants were male fishermen, most of them married or in co-habitation with partner. The average age for the full sample is 42.7351779 and the median is 42. Our sample is younger in Taganga and older in Las Flores. Most people finished elementary school, over 25 respondents didn’t have any formal education, and less than 20 had university degrees almost all of them from Taganga (There is an error with 0 - datapoint in Tasajeras). Most respondants were born on the community but ~70 of them came from somewhere else (error data point with 10). About 40 people have been moving around while <200 have always live in the same community.


grid.arrange(
  question (surv, q1=372, q2=161, q3 = 2, fun=mean)  + ggtitle('43. Gender \n Female=0'),
  question (surv, q1=372, q2=162, q3 = 2, fun=mean)  + ggtitle('44. Status\n Single=1, married=2, sambo=3, div/widow=4'),
  question (surv, q1=372, q2=163, q3 = 2, fun=mean)  + ggtitle('45. Age') ,
  question (surv, q1=372, q2=164, q3 = 2, fun=mean)  + ggtitle('46. Formal education\n none=1, elementary=2, high=3, uni=4'),
  question (surv, q1=372, q2=165, q3 = 2, fun=mean)  + ggtitle('46.1 Years of education'), # error on coding summary (surv[, 165])
  question (surv, q1=372, q2=166, q3 = 2, fun=mean)  + ggtitle('47. Born here?') , # error non binary
  question (surv, q1=372, q2=168, q3 = 2, fun=mean)  + ggtitle('49. Since when do you live here?'), # error coding summary (surv[, 168])
  question (surv, q1=372, q2=169, q3 = 2, fun=mean)  + ggtitle('50. Have you always lived here?'),
  question (surv, q1=372, q2=170, q3 = 2, fun=mean)  + ggtitle('51. How long have you\n been living here?'), #error coding
  question (surv, q1=372, q2=171, q3 = 2, fun=mean)  + ggtitle('51. Displaced?'), # error coding summary(surv[,171])

  nrow=2, ncol=6
)

When asking if they have suffered forced displacement the answers were coding on diverse formats and we cannot plot the info at the moment. However, few respondents (n=5) report forced displacement from 6 months up to 7 years. Question 52 was designed to assess sense of place. Each participant was asked to agree or disagree on a scale from 1 to 4 (1=strongly agree, 2=agree, 3=disagree, 4 strongly disagree) with statements that were read outloud for them. The figure below summarize their answers per location but I’m not quite sure how to interpret them.

grid.arrange(
  
  question (surv, q1=372, q2=172, q3 = 2, fun=mean)  + ggtitle('52.1 Miss the place'),
  question (surv, q1=372, q2=173, q3 = 2, fun=mean)  + ggtitle('52.2 Not belonging'),
  question (surv, q1=372, q2=174, q3 = 2, fun=mean)  + ggtitle('52.3 Safety'),
  question (surv, q1=372, q2=175, q3 = 2, fun=mean)  + ggtitle('52.4 Proud'),
  question (surv, q1=372, q2=176, q3 = 2, fun=mean)  + ggtitle('52.5 Part of me'),
  question (surv, q1=372, q2=177, q3 = 2, fun=mean)  + ggtitle('52.6 go away'),
  question (surv, q1=372, q2=178, q3 = 2, fun=mean)  + ggtitle('52.7 engaged'),
  question (surv, q1=372, q2=179, q3 = 2, fun=mean)  + ggtitle('52.8 my roots'),
  question (surv, q1=372, q2=180, q3 = 2, fun=mean)  + ggtitle('52.9 family and friends'),
  nrow=3, ncol=3
)

The game

Now I turn your attention to the game data. On groups of 4 we played with the 256 fishermen a fishing game, where each person were supposed to make a decision on how much to fish. Each group of 4 fishermen shared a common fish stock that started with 50 fish. Their decisions were private but they could communicate throghout the game. The game lasted 16 rounds unless they collapsed their resource, of which 6 rounds where for learning the game (‘base line within treatments’) and another 10 rounds they had to face one of four possible situations or treatments. Base line groups continued playing the same game as the first 6 rounds. Threshold treatment players were informed that a climate event has occurred decreasing the reproduction rate of the resource. Risk treatment players faced a similar situation but the event was not certain, they knew that the probability of the event happening was of 50% chance. Uncertainty treatment also faced the possibility of a climate event on the second part of the game but players didn’t know the actual probability of the event. The figure below shows a summary of all time lines of the game per place and per treatment. The blue line shows the mean of the time series while the shaded area shows the smooth, this is the Gaussian confidence intervals based on the t-distribution (so we assume normality). The red line marks round 6, this is the end of the first part of the game. It helps to caution the reader that in round 7 the treatment started and we reset the fish stock size to 50 as in the beginning of the game, that’s why in all timelines there is a ‘jump’ from round 6 to 7.

dat <- read.csv(file="~/Dropbox/BEST/Colombia/0_Game data/160427_corrected_full_data_long.csv", row.names=1)

g <- ggplot(dat=dat, aes(x=Round, y=value)) + 
        geom_vline( aes( xintercept=6, color='red', alpha=0.1), show.legend = F) + 
        stat_summary(fun.data='mean_cl_normal', geom='smooth') + # option 'mean_cl_boot I like the most but normal assumes normality
        facet_grid(Treatment ~ Place) 
g

Now imagine each individual is exploring the parameter space of the stock size at the beginning of each round (which depends on what the group did on the previous round), and her/his own decision. Each individual will then follow a trajectory on that phase diagram that potentially capture a signature of her/his strategy. The figure below shows such trajectory, where each line represents the time series of each person for the afternoon session in Las Flores on February 4th. We played simultaneously with 16 players, 4 in each of the 4 above mentioned treatments. Here we see that the base line group collapsed twice, once on round 5 and another in round 11; this is why we see the line broken and never reaching the light blue colour that denotes the end of the game. Here it also comes clear that, for example, people who played the threshold treatment collectively chose to stay on high levels of the resource by fishing maximum 4 units and maintaining their resource above 38 fishes.

datP1 <- dat %>%
  filter(dat$Place == 'Las Flores' & dat$Session == 'pm' & Date == "2016-02-04" ) %>%
  gdata::drop.levels()

p <- ggplot(data=datP1, aes(x=value, y=StockSizeBegining), group=Player) + 
        geom_path(aes(color=Round)) + facet_grid(Treatment~Player) +
        ggtitle('Phase space trajectory per player\n Las Flores, 2016-02-04-pm ') + 
        theme(text= element_text(family='Helvetica', size=9))

p

At the group level, the same rationale can be applied if one plots the stock size at the begining of each round against the sum of the total catch for all 4 participants per group. Note that we have in total 64 groups (4 treatments * 4 sessions * 4 locations). The figure below summarizes all 64 groups, each one represented with a colour, and each colour showing the contour or the parameter space where most points were concentrated by location and treatment. The contour was calculated with a 2-dimensional smoother (bandwidth = 15). The figure on the left shows the aggregated group behaviour by ploting the sum of the total catch. The figure on the right shows the resource dynamics at the beginning of each round vs. the resource at the end of the round which is the same as the new stock size for the next round, this is what people fish plus the reproduction in the round. In dynamical systems this is equivalent to \(f(x[t+1]):x[t]\).

c <- ggplot(data=dat, aes(y=StockSizeBegining, x=SumTotalCatch), group=Session) + 
  stat_density_2d(aes(color=group, alpha=0.5), n=100, h=15, show.legend = F) + 
  facet_grid( Treatment ~ Place)+ 
    theme(text= element_text(family='Helvetica', size=9))
p <- ggplot(data=dat, aes(y=NewStockSize, x=StockSizeBegining), group=Session) + 
  geom_abline(aes(color='black', alpha=0.5),slope=1, intercept = 0) +
  stat_density_2d(aes(color=group, alpha=0.3), n=100, h=15, show.legend = F) + 
  geom_point(aes(color=group, alpha=0.3), show.legend=F) + 
  facet_grid( Treatment ~ Place)+ 
    theme(text= element_text(family='Helvetica', size=9))

grid.arrange ( c,p, ncol=2, nrow=1 )

The next natural question to ask is whether there is an effect between:

  • The first and the second part of the game
  • Treatments
  • Place

The four panels below try to get to these differences

pm1 <- ggpairs(data= filter(dat, dat$part == F), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Place, alpha=0.5), title='Color by place, first part') 

pm2 <- ggpairs(data= filter(dat, dat$part == T), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Place, alpha=0.5), title='Color by place, second part') 

pm3 <- ggpairs(data= filter(dat, dat$part == F), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Treatment, alpha=0.5), title='Color by treatment, second part') 

pm4 <- ggpairs(data= filter(dat, dat$part == T), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Treatment, alpha=0.5), title='Color by treatment, second part') 

#grid.arrange(
  pm1   + theme(text= element_text(family='Helvetica', size=6))
  pm2   + theme(text= element_text(family='Helvetica', size=6))
  pm3   + theme(text= element_text(family='Helvetica', size=6))
  pm4   + theme(text= element_text(family='Helvetica', size=6))
#  ncol=2, nrow=2
#)

Something I’d really like to get into (for modeling purposes) is identifying fishing strategies or styles. To do so I ran a PCA and MDS on the time series (16 rounds) of each of the 256 players in our dataset. The results are shown below.

# library(vegan)
# players <- (reshape::cast(dat, ID_player ~ Round))[,-1] #delete playersID
# place <- reshape::cast(dat, ID_player ~  Place)[,-1] # 16 because calculate length of 'value'
# group <- reshape::cast(dat, ID_player ~  group)[,-1]
# treat <- reshape::cast(dat, ID_player ~  Treatment)[,-1]
# context <- cbind(place,treat) # don't use group yet, maybe for aes
# 
# pca <- rda(players, context) #, context
# 
# # note: I ran mds with 'euclidean' distance and 'manhattan'. Euclidean gave an error saying there is not enough data. manhattan calculated but didn't reach convergence. Morisita, jaccard, kulczynski, horn, rau, don't reach convergence neither. Frequent error on different measures: Stress is (nearly) zero - you may have insufficient data. It does work however if I use the transpose. It calculates the ordering of rounds on mds$points, but the ordering of players can be found too at mds$species
# mds <- metaMDS(players, dist= 'manhattan', trymax=400, autotransform=F, k=2)
# # mds2 <- metaMDS(players, dist= 'mahalanobis', trymax=1000, autotransform=F, k=3, previous.best = mds) # solution from http://stackoverflow.com/questions/14434794/no-stable-solution-using-metamds-in-vegan
# ef1 <- envfit(mds, context, permu=999)
# 
# par(mfrow=c(1,3))
#   plot(pca)
#   plot(mds, type='p', display=c('sites', 'species'), cex=0.8)
#   plot(ef1, p.max=0.05, col='blue', cex=0.8)
#   stressplot(mds)
    

To-Do

  1. ggpairs plots for treatments, place, and before/after
  2. statistics: t-test, Mann-Witness test, Fisher exact test for all
  3. some simple regressions
  4. styles clustering
# g <- ggplot(dat, aes(x=value, y=StockSizeBegining,color=Treatment )) +
#       geom_point() +
#       geom_smooth( method='loess') +
#       ggtitle('Local polynomial regression\n Stock size vs Decision')
# g